Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games
Abstract
We consider the problem of finding stationary Nash equilibria (NE) in a finite discounted general-sum stochastic game. We first generalize a non-linear optimization problem from [9] to a general N-player game setting. Next, we break down the optimization problem into simpler sub-problems, each of which ensures that there is no Bellman error for a given state and a given agent. We then characterize the solution points of these sub-problems that correspond to Nash equilibria of the underlying game; for this purpose, we derive a set of necessary and sufficient SG-SP (Stochastic Game Sub-Problem) conditions. Using these conditions, we develop two provably convergent algorithms. The first algorithm, OFF-SGSP, is centralized and model-based, i.e., it assumes complete information of the game. The second algorithm, ON-SGSP, is an online model-free algorithm. We establish that both algorithms converge, in self-play, to the equilibria of a certain ordinary differential equation (ODE), whose stable limit points coincide with stationary NE of the underlying general-sum stochastic game. On a single-state non-generic game [12] as well as on a synthetic two-player game setup with 810,000 states, we show empirically that ON-SGSP consistently outperforms the NashQ [16] and FFQ [21] algorithms.
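To make the notion of a per-state, per-agent Bellman error concrete, the following is a minimal sketch in Python. It is not the SG-SP formulation from the paper: it computes the standard Bellman residual of one agent's value function under a fixed joint stationary strategy, assuming agents randomize independently. The function names (`joint_policy`, `bellman_error`), the data layout, and the toy single-state game are all hypothetical and chosen only for illustration.

```python
from itertools import product

def joint_policy(strategies, state):
    """Probability of each joint action in `state`.

    Assumption: strategies[agent][state][action] is a probability, and
    agents randomize independently, so joint probabilities are products.
    """
    action_sets = [range(len(pi[state])) for pi in strategies]
    probs = {}
    for joint in product(*action_sets):
        p = 1.0
        for pi, a in zip(strategies, joint):
            p *= pi[state][a]
        probs[joint] = p
    return probs

def bellman_error(state, value, reward, transition, strategies, gamma):
    """Bellman residual of one agent's value function at `state`.

    reward[state][joint] is that agent's single-stage reward and
    transition[state][joint] is a list of next-state probabilities.
    """
    backup = 0.0
    for joint, prob in joint_policy(strategies, state).items():
        expected_next = sum(
            p * value[s2] for s2, p in enumerate(transition[state][joint])
        )
        backup += prob * (reward[state][joint] + gamma * expected_next)
    return value[state] - backup

# Hypothetical toy game: one state, two agents, two actions each.
strategies = [[[0.5, 0.5]], [[0.5, 0.5]]]  # both agents play uniformly
reward = [{(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}]  # agent 1's rewards
transition = [{joint: [1.0] for joint in reward[0]}]  # always stay in state 0
gamma = 0.5

consistent = bellman_error(0, [1.0], reward, transition, strategies, gamma)
inconsistent = bellman_error(0, [0.0], reward, transition, strategies, gamma)
```

In this toy game the expected single-stage reward under uniform play is 0.5, so with a discount factor of 0.5 the value 1.0 has zero residual, whereas the value 0.0 does not. A zero residual only says the value function is consistent with the given strategies; it does not by itself show that those strategies form a Nash equilibrium.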
Related articles
Algorithms for Nash Equilibria in General-Sum Stochastic Games
Over the past few decades, the quest for algorithms to compute Nash equilibria in general-sum stochastic games has intensified, and several important algorithms (cf. [9], [12], [16], [7]) have been proposed. However, they either lack generality or are intractable for even medium-sized problems, or both. In this paper, we first formulate a non-linear optimization problem for stochast...
Robust Learning for Repeated Stochastic Games via Meta-Gaming
This paper addresses learning in repeated stochastic games (RSGs) played against unknown associates. Learning in RSGs is extremely challenging due to their inherently large strategy spaces. Furthermore, these games typically have multiple (often infinite) equilibria, making attempts to solve them via equilibrium analysis and rationality assumptions wholly insufficient. As such, previous learnin...
A Study of Gradient Descent Schemes for General-Sum Stochastic Games
Zero-sum stochastic games are easy to solve as they can be cast as simple Markov decision processes. This is however not the case with general-sum stochastic games. A fairly general optimization problem formulation is available for general-sum stochastic games by Filar and Vrieze [2004]. However, the optimization problem there has a non-linear objective and non-linear constraints with special s...
Learning with Partial Observations in General-sum Stochastic Games
In many situations, multiagent systems must deal with agents having only partial observability of the environment. In these cases, finding optimal solutions is often intractable for more than two agents, and approximate solutions are often the only way to solve these problems. The model known to represent this kind of problem is the Partially Observable Stochastic Game (POSG). Such a model is usuall...
Stochastic Learning of Equilibria in Games: The Ordinary Differential Equation Method
Our purpose is to discuss stochastic algorithms to learn equilibria in games, and their time of convergence. To do so, we consider a general class of stochastic algorithms that converge weakly (in the sense of weak convergence for stochastic processes) towards solutions of particular ordinary differential equations, corresponding to their mean-field approximations. Tuning parameters in these al...
Publication date: 2015